A game-theoretic analysis of Wikipedia’s peer production: The interplay between community’s governance and contributors’ interactions

Peer production, such as the collaborative authoring of Wikipedia articles, involves both cooperation and competition between contributors. Cooperatively, Wikipedia’s contributors attempt to create high-quality articles, and at the same time, they compete to align Wikipedia articles with their personal perspectives and “take ownership” of the article. This process is governed collectively by the community, which works to ensure the neutrality of the content. We study the interplay between individuals’ cooperation and competition, considering the community’s endeavor to ensure a neutral point of view (NPOV) on articles. We develop a two-level game-theoretic model: the first level models the interactions between individual contributors who seek both cooperative and competitive goals and the second level models governance of co-production as a Stackelberg (leader-follower) game between contributors and the communal neutrality-enforcing mechanisms. We present our model’s predictions regarding the relationship between contributors’ personal benefits of content ownership and their characteristics, namely their cooperative/competitive orientation and their activity profile (whether creators or curators of content). We validate the model’s prediction through an empirical analysis, by studying the interactions of 219,811 distinct contributors that co-produced 864 Wikipedia articles over a decade. The analysis and empirical results suggest that the factor that determines who ends up owning content is the ratio between one’s cooperative/competitive orientation (estimated based on whether a core or peripheral community member) and the contributor’s creator/curator activity profile (proxied through average edit size per sentence). Namely, under the governance mechanisms, the fractional content that is eventually owned by a contributor is higher for curators that have a competitive orientation. 
Although neutrality-seeking mechanisms are essential for ensuring that ownership is not concentrated within a small number of contributors, our findings suggest that the burden of excessive governance may deter contributors from participating, and thus indirectly curtail the peer production of high-quality articles.

We thank the reviewer for the detailed reading of the paper and their constructive feedback. The reviewer's additional comments helped improve the paper's quality, and we are grateful for that.
Does the quality (sum of all xi) need a multiplier so that the relative magnitude of quality and the fractional ownership can be better accounted for? I am worried that the way the utility function is modeled in (2), the quality may have an outsized impact on the overall quality compared to the fractional ownership, given that perhaps the magnitude of the argument for quality is going to be significantly larger than the fractional ownership argument. I am not sure if this impacts anything, and perhaps it doesn't, but just a quick confirmation that this is not driving the results would help.
We thank the reviewer for raising this important issue. We have tried to address it both empirically and analytically.
We begin the description of the empirical analysis by recalling that our model specifies two types of contributors: the vast majority of contributors, who have a competitive orientation (wi close to 1), and the others, who have a cooperative orientation (wi close to 0). Each type derives a distinct benefit from participating in Wikipedia's collaborative authoring process. Specifically, the "benefit" side of our utility function includes two components: the competitive benefit (i.e. fractional ownership), weighted by wi, and the cooperative benefit (i.e. a high-quality article), weighted by 1-wi. In a sense, these weights work as scaling factors.
Empirically, we find that these weights keep the values of these two components comparable. For example, for curators with a competitive orientation whose fractional ownership was more than 0.5, the net utility was of the order of 1.2. That is, although the cooperative benefit (Σxi) increases the utility, it does not exceed the competitive benefit (i.e. fractional ownership) by orders of magnitude. For cooperative contributors, on the other hand, the cooperative benefit does dominate the competitive benefit, but these contributors do not own much content (as our results have already shown), and thus this imbalance between the two benefit components does not affect the pattern of results.
We also performed an additional analysis that includes a scaling factor q<<1. This did change the analytical results: the Nash equilibrium for content ownership changed (namely, Σ(Lβi-1) was replaced by Σ(Lβi-q)). However, we found that this change had very little effect on our numerical results, since Lβi dominated in the final expression. We also note that our asymptotic results hold even when this scaling factor is included.
We considered whether to include the scaling factor in the revised manuscript, but opted to leave it out, given that it complicates the model, yet adds no additional insight. After all, modelling should strive for parsimony.
That being said, if R2 thinks that it is better to include the scaling factor in our conceptualization, we are happy to do so in the next iteration.
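To make the weighting discussed above concrete, here is a minimal Python sketch of the "benefit" side only. It is an illustration under stated assumptions, not the utility function of Equation (2): the function name `benefit_components`, the placement of q as a multiplier on the quality term, and all numeric values are hypothetical, and the cost side of the utility is omitted.

```python
def benefit_components(w_i, ownership_i, contributions, q=1.0):
    """Return the (competitive, cooperative) benefit terms for one contributor.

    Assumed form, for illustration only:
      competitive = w_i * ownership_i
      cooperative = (1 - w_i) * q * sum(contributions)
    """
    quality = sum(contributions)              # article quality: sum of all x_i
    competitive = w_i * ownership_i           # fractional ownership, weighted by w_i
    cooperative = (1.0 - w_i) * q * quality   # quality, weighted by 1 - w_i, scaled by q
    return competitive, cooperative


# Hypothetical values, not taken from the study's data.
contributions = [1.0, 2.0, 3.0]
competitive, cooperative = benefit_components(0.9, 0.6, contributions)
scaled = benefit_components(0.9, 0.6, contributions, q=0.1)
print(competitive, cooperative, scaled)
```

With q = 1 the sketch uses only the wi and 1-wi weights described above; with q<<1 only the cooperative term shrinks, which is the kind of rescaling the reviewer asked about.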
The asymptotic results in lines 574-587 state that the results are in "the presence of Wikipedia's governance to ensure neutrality"; however, the result from Theorem 4 seems to hold irrespective of t. The authors need to clarify that it is just because of the costs and benefits in the utility function, other than t, that those users with low edit sizes and high competitiveness end up owning a larger share. It might be nice to add a bit of intuition here based on not just the asymptotic case, but also the general case in Theorem 3, on why this happens.
We agree with the reviewer that the asymptotic result (Theorem 4) does not depend on t, and we are grateful to the reviewer for pushing us to clarify this issue.
First, to place the comment in context, we note that in this part of the Results section we present our findings and only briefly allude to the implications. Later, in the Discussion section, we interpret the study's findings at length and elaborate on their implications.
As mentioned by the reviewer, the content ownership (Equation (6) from Theorem 3) does depend on the governance level, t. Since Equation (6) is too complex to explain intuitively, we chose to provide the intuitive reasoning for the asymptotic result (i.e. when the number of contributors becomes very large), as presented in Theorem 4. Hence, when describing the asymptotic result, we removed the sentence stating that this result is "in the presence of governance".
Nonetheless, we stress that the fact that t drops out of the derivation that yields the equation in Theorem 4 (Equation (8)) does not mean that governance levels do not affect the general pattern of results. As we demonstrate later in the paper, governance levels do in fact influence the cooperation/competition dynamics (e.g. excessive levels of governance bring the process to a halt).
In the Discussion section (specifically, Section 6, line 856 in the revised manuscript) we discuss the way in which governance policies that enforce neutrality influence our empirical findings (namely, curators with competitive orientation ending up owning more content).
I think I didn't properly explain what I meant before when I asked the authors to take a look at the plot that has w and \beta on two dimensions. I was asking for the plot from the game-theoretic model on these two. Basically, for a given parameter set, a plot can be drawn to show for example that those users with w/\beta < E[W/\Beta] own a non-significant portion of the content, whereas those with w/\beta > E[W/\Beta] do not own any of the content per Theorem 4 (or perhaps find a way to discuss the findings in Theorem 3?).
The revised manuscript includes a newly added line that marks E(β/w) on the plots, as shown below. It is now clearer that only users with a ratio larger than the group's average end up owning content.
Our intention for the Literature Review section is to describe the state of the art, with a focus on the angles most pertinent to our study's research question.
In the Discussion section, once the study's findings have been presented, we come back to the different streams within the literature that were reviewed earlier, and discuss how our findings inform and contribute to each of these streams. Although there is no 1:1 mapping between the structure of the Literature Review and the Discussion sections, we did strive to make linkages. In particular:
• The 3rd paragraph (which starts with "We note that prior studies have discussed individual contributors' attempts to influence articles' contents …") discusses our contribution to the literature on competition in Wikipedia's content production (i.e. the attempt to "own" article portions).
• The paragraph that follows (beginning with "The finding from the non-cooperative game that a competitive orientation yields …") discusses our paper's contribution to the literature on contributors' roles, specifically creators and curators.
• The next paragraph (which begins with "A second powerful result of this study is in demonstrating that") discusses our contribution to the study of Wikipedia's governance, specifically in highlighting some of the negative side effects associated with excessive levels of governance.
When introducing \betai, clarify the continuum for it: does high \beta mean creator or curator? This is formally clarified much later on line 567.
Once again, the reviewer has raised a good point. We have kept the explanation of β being a continuum, both in the introduction and in the analysis section.
The y-axis on Figure 4 goes from 0 to 10^0, perhaps better to note it as 0 to 1?
This is an artifact of MATLAB. When we plot on a linear scale, the variation is not visible. Hence, we used a semi-log scale, and the label therefore appears as 10^0.
Research question may be better provided not in a section of its own, though this is subjective and I let the authors decide whether to change anything about this.
We have considered integrating the section on the research question into the literature review chapter (i.e. making it Section 2.4), but we think that it is more appropriate to leave it as a separate (short) section.
Writing: I found a couple of issues; you may want to do another round of proofreading before the final version.
o On line 430, the "a" before "either" should be removed and there is an extra space after the "(".
o On line 574, it should be "who ARE creators of content".
We have now carefully proofread the paper and corrected all grammatical and typographical errors.